Emotional voice conversion: Theory, databases and ESD
Abstract
In this paper, we first provide a review of the state-of-the-art emotional voice conversion research and the existing emotional speech databases. We then motivate the development of a novel emotional speech database (ESD) that addresses the increasing research need. With this paper, the ESD database is now made available to the research community. The ESD database consists of 350 parallel utterances spoken by 10 native English and 10 native Chinese speakers and covers 5 emotion categories (neutral, happy, angry, sad and surprise). More than 29 hours of speech data were recorded in a controlled acoustic environment. The database is suitable for multi-speaker and cross-lingual emotional voice conversion studies. As case studies, we implement several state-of-the-art emotional voice conversion systems on the ESD database. This paper provides a reference study on ESD in conjunction with its release.
Similar resources
GMM-based voice conversion applied to emotional speech synthesis
Voice conversion method is applied to synthesizing emotional speech from standard reading (neutral) speech. Pairs of neutral speech and emotional speech are used for conversion rule training. The conversion adopts GMM (Gaussian Mixture Model) with DFW (Dynamic Frequency Warping). We also adopt STRAIGHT, the high-quality speech analysis-synthesis algorithm. As conversion target emotions, (Hot) a...
Syllabic Pitch Tuning for Neutral-to-emotional Voice Conversion
Prosody plays an important role in neutral-to-emotional voice conversion. Prosodic features like pitch are usually estimated and altered at a segmental level based on short windowing of speech signal (where the signal is expected to be quasi-stationary). This results in a frame-wise change of acoustical parameters for synthesizing emotionalized speech. In order to convert a neutral speech to an...
Emotional Speech Synthesis Based on Improved Codebook Mapping Voice Conversion
This paper presents a spectral transformation method for emotional speech synthesis based on voice conversion framework. Three emotions are studied, including anger, happiness and sadness. For the sake of high naturalness, superior speech quality and emotion expressiveness, our original STASC system is modified by introducing a new feature selection strategy and hierarchical codebook mapping pr...
A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis
This paper presents a comparison of methods for transforming voice quality in neutral synthetic speech to match cheerful, aggressive, and depressed expressive styles. Neutral speech is generated using the unit selection system in the MARY TTS platform and a large neutral database in German. The output is modified using voice conversion techniques to match the target expressive styles, the focus...
Voice Conversion
Voice conversion (VC) is an area of speech processing that deals with the conversion of the perceived speaker identity. In other words, the speech signal uttered by a first speaker, the source speaker, is modified to sound as if it was spoken by a second speaker, referred to as the target speaker. The most obvious use case for voice conversion is text-to-speech (TTS) synthesis where VC techniqu...
Journal
Journal title: Speech Communication
Year: 2022
ISSN: 1872-7182, 0167-6393
DOI: https://doi.org/10.1016/j.specom.2021.11.006